Figure: tcsh and sh side-by-side on a Mac OS X desktop.

Original author(s) | Bill Joy |
---|---|
Initial release | 1978 |
Stable release | tcsh 6.17.00 / July 10, 2009 |
Written in | C |
Operating system | BSD, UNIX, Linux, Mac OS X |
Type | Unix Shell |
License | BSD license |
The C shell (csh or the improved version, tcsh, on most machines) is a Unix shell that was created by Bill Joy while he was a graduate student at the University of California, Berkeley, in the late 1970s. It has been distributed widely, beginning with the 2BSD release of the BSD Unix system that Joy began distributing in 1978.[1][2] Other early contributors to the ideas or the code were Michael Ubell, Eric Allman, Mike O'Brien and Jim Kulp.[3]
The C shell is a command processor that's typically run in a text window, allowing the user to type commands which cause actions. The C shell can also read commands from a file, called a script. Like all Unix shells, it supports filename wildcarding, piping, here documents, command substitution, variables and control structures for condition-testing and iteration. What differentiated the C shell, especially in the 1980s, were its interactive features and overall style. Its new features made it easier and faster to use. The overall style of the language looked more like C and was seen as more readable.
Today, csh on most machines is actually tcsh, an improved version of csh. As a practical matter, tcsh is csh: One file containing the tcsh executable has links to it as both "csh" and "tcsh" so that either name refers to the same improved version of the C shell.
tcsh added filename and command completion and command line editing concepts borrowed from the Tenex system, which is where the "t" came from.[4] Because it only added functionality and didn't change what was there, tcsh remained backward compatible[5] with the original C shell. And though it started as a side branch from the original source tree Joy had created, tcsh is now the main branch for ongoing development. tcsh is very stable but new releases continue to appear roughly once a year, consisting mostly of minor bug fixes.[6]
The main design objectives for the C shell were that it should look more like the C programming language and that it should be better for interactive use.
The Unix system had been written almost exclusively in C, so the C shell's first objective was a command language that was more stylistically consistent with the rest of the system. The keywords, the use of parentheses and the C shell's built-in expression grammar and support for arrays were all strongly influenced by C.
By today's standards, csh may not seem particularly more C-like than many other popular scripting languages. But through the 80s and 90s, the difference was seen as striking, particularly when compared to sh, the then-dominant shell written by Stephen Bourne at AT&T. This example illustrates the C shell's more conventional expression operators and syntax.
    #!/bin/sh
    if [ $days -gt 365 ]
    then
        echo This is over a year.
    fi

    #!/bin/csh
    if ( $days > 365 ) then
        echo This is over a year.
    endif

sh lacked an expression grammar. The square bracketed condition had to be evaluated by the slower means of running the external test program. sh's if command took its argument words as a new command to be run as a child process. If the child exited with a zero return code, sh would look for a then clause (a separate statement, but often written joined on the same line with a semicolon) and run that nested block. Otherwise it would run the else. Hard-linking the test program as both "test" and "[" gave the notational advantage of the square brackets and the appearance that the functionality of test was part of the sh language. sh's use of a reversed keyword to mark the end of a control block was a style borrowed from ALGOL 68.[7]
By contrast, csh could evaluate the expression directly, which made it faster. It also claimed better readability: Its expressions used a grammar and a set of operators mostly copied from C, none of its keywords were reversed and the overall style was also more like C.
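The way sh's if works, as described above, can still be observed in any Bourne-compatible shell. The following is a minimal sketch, assuming a POSIX sh; the function name is_over_a_year is hypothetical and exists only for illustration. It shows that if simply runs a command and branches on its exit status, and that "[" and "test" are two spellings of the same command. (Most modern shells implement both as built-ins rather than as an external program, which changes the speed but not the semantics.)

```shell
#!/bin/sh
# Hypothetical helper: exits with status 0 when $1 is greater than 365.
is_over_a_year() {
    test "$1" -gt 365
}

# "if" runs a command and branches on its exit status.
if is_over_a_year 400; then
    echo "400: This is over a year."
fi

# "[" is test under another name; the closing "]" is just its last argument.
if [ 100 -gt 365 ]; then
    echo "100: This is over a year."
else
    echo "100: This is not over a year."
fi

# Any command can serve as the condition, not only test.
if true; then
    echo "true exited with status 0"
fi
```

Because the condition is an ordinary command, a script can put a pipeline or a user-defined function in that position just as easily as a bracketed test.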
Here is a second example, comparing scripts that calculate the first 10 powers of 2.
    #!/bin/sh
    i=2
    j=1
    while [ $j -le 10 ]; do
        echo '2 **' $j = $i
        i=`expr $i '*' 2`
        j=`expr $j + 1`
    done

    #!/bin/csh
    set i = 2
    set j = 1
    while ( $j <= 10 )
        echo '2 **' $j = $i
        @ i *= 2
        @ j++
    end

Again because of the lack of an expression grammar, the sh script uses command substitution and the expr command. The @ statement in C shell is a pun: it's the "at-sign-ment" statement.
Finally, here's a third example, showing the differing styles for a switch statement.
    #!/bin/sh
    for i in d*
    do
        case $i in
            d?) echo $i is short ;;
            *)  echo $i is long ;;
        esac
    done

    #!/bin/csh
    foreach i ( d* )
        switch ( $i )
            case d?:
                echo $i is short
                breaksw
            default:
                echo $i is long
        endsw
    end

In the sh script, ";;" marks the end of each case because sh disallows null statements otherwise.
The second objective was that the C shell should be better for interactive use. In support of that, it introduced numerous new features that made it easier, faster and more friendly to use by someone sitting at a terminal, typing commands. Users could get things done with a lot fewer keystrokes and it ran faster. The most significant of these new features were the history and editing mechanisms, aliases, directory stacks, tilde notation, cdpath, job control and path hashing. These new features proved very popular and many of them have since been copied by other Unix shells.
- History: "!!", typed as a command and referred to as "bang, bang," means run the immediately preceding command. Other short keystroke combinations, e.g., "!$" to mean just the last argument of the previous command, allow bits and pieces of previous commands to be pasted together and edited to form a new command.
- Aliases: for simple cases, e.g., abbreviating the "fgrep" command by typing "f", an alias runs faster and is more convenient than creating a script.
- Tilde notation: home directories can be referred to using the "~" character.
- Cdpath: this applies to the cd (change directory) command. If the specified directory isn't in the current directory, csh will try to find it in the cdpath directories.
- Job control: a running job could be suspended by typing ^Z. The user could then switch back-and-forth between jobs using the fg command. The active job was said to be in the foreground. Other jobs were said to be either suspended (stopped) or running in the background.

The C shell operates line-at-a-time. Each line is tokenized into a set of words separated by spaces or other characters with special meaning, including parentheses, the piping and i/o redirection operators and the semicolon or ampersand.
A basic statement is one that simply runs a command. The first word is taken as the name of the command to be run and may be either an internal command, e.g., "echo," or an external command. The rest of the words are passed as arguments to the command.
At the basic statement level, here are some of the features of the grammar:
- "*" matches any number of characters.
- "?" matches any single character.
- "[...]" matches any of the characters inside the square brackets. Ranges are allowed, using the hyphen.
- "[!...]" matches any character not in the set.
- "abc{def,ghi}" is alternation and expands to abcdef and abcghi.
- "~" means the current user's home directory.
- "~user" means user's home directory.
- Wildcards across multiple directory levels, e.g., "*/*.c", are supported.
- ... main() is invoked) because of its MS-DOS heritage: MS-DOS only allowed a 128-byte command line to be passed to an application, making wildcarding by the DOS command prompt impractical.
- "> file" means stdout will be written to file, overwriting it if it exists, and creating it if it doesn't. Errors still come to the shell window.
- ">& file" means both stdout and stderr will be written to file, overwriting it if it exists, and creating it if it doesn't.
- ">> file" means stdout will be appended at the end of file.
- ">>& file" means both stdout and stderr will be appended at the end of file.
- "< file" means stdin will be read from file.
- "<< string" is a here document. Stdin will read the following lines up to the one that matches string.
- ";" means run the first command and then the next.
- "&&" means run the first command and, if it succeeds with a 0 return code, run the next.
- "||" means run the first command and, if it fails with a non-zero return code, run the next.
- "|" means connect stdout to stdin of the next command. Errors still come to the shell window.
- "|&" means connect both stdout and stderr to stdin of the next command.
- After a "$", the following characters are taken as the name of a variable and the reference is replaced by the value of that variable. Various editing operators, typed as suffixes to the reference, allow pathname editing (e.g., to extract just the extension) and other operations.
- "\" means take the next character as an ordinary literal character.
- "string" (in double quotes) is a weak quote. Enclosed whitespace and wildcards are taken as literals, but variable and command substitutions are still performed.
- 'string' (in single quotes) is a strong quote. The entire enclosed string is taken as a literal.
- `command` (in backquotes) means take the output of command, parse it into words and paste them back into the command line.
- "command &" means start command in the background and prompt immediately for a new command.
- "( commands )" means run commands in a subshell.

The C shell provides control structures for both condition-testing and iteration. The condition-testing control structures are the if and switch statements. The iteration control structures are the while, foreach and repeat statements.
The C shell implements both shell and environment variables. Environment variables, created using the setenv statement, are always simple strings, passed to any child processes, which retrieve these variables via the envp[] argument to main().
Shell variables, created using the set or @ statements, are internal to C shell. They are not passed to child processes. Shell variables can be either simple strings or arrays of strings. Some of the shell variables are predefined and used to control various internal C shell options, e.g., what should happen if a wildcard fails to match anything.
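The same split between shell-internal and inherited variables exists in the Bourne shell, where it is easy to observe. The sketch below is POSIX sh, not csh syntax: a plain assignment plays roughly the role of csh's set, and export plays roughly the role of setenv. The variable names LOCAL_ONLY and PASSED_DOWN are hypothetical.

```shell
#!/bin/sh
# Plain assignment: a shell variable, visible only inside this shell.
LOCAL_ONLY="shell variable"

# Exported assignment: an environment variable, copied into every child process.
PASSED_DOWN="environment variable"
export PASSED_DOWN

# The child is a separate sh process; it sees only what arrived in its environment.
# ${NAME-text} expands to "text" when NAME is unset.
sh -c 'echo "LOCAL_ONLY=${LOCAL_ONLY-<unset>}"; echo "PASSED_DOWN=${PASSED_DOWN-<unset>}"'
```

Assuming neither name was already present in the caller's environment, the child reports LOCAL_ONLY as unset and prints the value of PASSED_DOWN.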
In current versions of csh, strings can be of arbitrary length, well into millions of characters.
The C shell implements a 32-bit integer expression grammar with operators borrowed from C but with a few additional operators for string comparisons and filesystem tests, e.g., testing for the existence of a file. Operators must be separated by whitespace from their operands. Variables are referenced as $name.
Operator precedence is also borrowed from C, but with different operator associativity rules to resolve the ambiguity of what comes first in a sequence of equal precedence operators. In C, the associativity is left-to-right for most operators; in C shell, it's right-to-left. For example,
    // C groups from the left
    #include <stdio.h>

    int main(void)
    {
        // prints 4
        int i = 10 / 5 * 2;
        printf( "%d\n", i );

        // prints 5
        i = 7 - 4 + 2;
        printf( "%d\n", i );

        // prints 16
        i = 2 >> 1 << 4;
        printf( "%d\n", i );

        return 0;
    }

    # C shell groups from the right

    # prints 1
    @ i = 10 / 5 * 2
    echo $i

    # prints 1
    @ i = 7 - 4 + 2
    echo $i

    # prints 0
    @ i = ( 2 >> 1 << 4 )
    echo $i

The parentheses in the C shell example are to avoid having the bit-shifting operators confused as I/O redirection operators. In either language, parentheses can always be used to explicitly specify the desired order of evaluation, even if only for clarity.
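The two groupings can be checked without a C compiler or a copy of csh. The following sketch uses POSIX sh arithmetic expansion, which groups equal-precedence operators from the left as C does; the second half adds explicit parentheses to force the right-to-left grouping described above for the C shell. It is an sh illustration of the arithmetic only, not csh code.

```shell
#!/bin/sh
# Left-to-right grouping, as in C.
echo $(( 10 / 5 * 2 ))       # (10 / 5) * 2   -> 4
echo $(( 7 - 4 + 2 ))        # (7 - 4) + 2    -> 5
echo $(( 2 >> 1 << 4 ))      # (2 >> 1) << 4  -> 16

# Explicit parentheses reproduce the right-to-left grouping.
echo $(( 10 / (5 * 2) ))     # 10 / 10        -> 1
echo $(( 7 - (4 + 2) ))      # 7 - 6          -> 1
echo $(( 2 >> (1 << 4) ))    # 2 >> 16        -> 0
```

The last three results match the values the C shell example prints, which is why parenthesizing is the safest habit when porting an expression between the two languages.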
Though popular for interactive use because of its many innovative features, csh has never been as popular for scripting. Initially, and through the 1980s, csh couldn't be guaranteed to be present on all Unix systems. sh could, which made it a better choice for any scripts that might have to run on other machines. By the mid-1990s, csh was widely available, but the use of csh for scripting faced new criticism by the POSIX committee,[8] which specified there should only be one preferred shell for both interactive and scripting purposes and that that one preferred shell should be the Korn Shell. C shell also faced criticism from others[9][10] over the C shell's alleged defects in the syntax, missing features and poor implementation.
Syntax defects were generally simple but unnecessary inconsistencies in the definition of the language. For example, the set, setenv and alias commands all did basically the same thing, namely, associate a name with a string or set of words. But all three had slight, but completely unnecessary differences. An equal sign was required for a set but not for setenv or alias; parentheses were required around a word list for a set but not for setenv and alias, etc. Similarly, the if, switch and the looping constructs use needlessly different keywords (endif, endsw and end) to terminate the nested blocks.
The missing features most commonly cited are the ability to manipulate the stdio file handles independently and support for functions. Whereas Bourne shell functions lacked only local variables, csh's aliases, the closest analogue in csh to functions, were restricted to single lines of code, despite most flow control constructs requiring newlines to be recognized. As a result, csh scripts could not be functionally broken down as C programs themselves could be, and larger projects tended to shift to either Bourne shell scripting or C code instead.
The implementation, which used an ad hoc parser, has drawn the most serious criticism. By the early 1970s, compiler technology was sufficiently mature[11] that most new language implementations used either a top-down or bottom-up parser capable of recognizing a fully recursive grammar. It's not known why an ad hoc design was chosen instead for the C shell. It may be simply that, as Joy put it in an interview in 2009, "When I started doing this stuff with Unix, I wasn't a very good programmer."[12] But that choice of an ad hoc design meant that the C shell language was not fully recursive. There was a limit to how complex a command it could handle.
It worked for most things users typed interactively, but on the more complex commands a user might take time to write in a script it didn't work well and could easily fail, producing only a cryptic error message or an unwelcome result. For example, the C shell could not support piping between control structures. Attempting to pipe the output of a foreach command into grep simply didn't work. (The work-around, which works for many of the complaints related to the parser, is to break the code up into separate scripts. If the foreach is moved to a separate script, piping works because scripts are run by forking a new copy of csh that does inherit the correct stdio handles.)
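For comparison, the Bourne shell's grammar treats a whole loop as a single compound command, so its output can be piped like that of any other command. A minimal sketch in POSIX sh (not csh); the word list is arbitrary illustration data.

```shell
#!/bin/sh
# The entire for loop is one compound command, so it can feed a pipeline directly.
for name in alpha beta gamma delta; do
    echo "$name"
done | grep 'l'
```

Only the names containing the letter "l" (alpha and delta) pass through grep.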
Another example is the unwelcome behavior in the following fragments. Both of these appear to mean, "If 'myfile' does not exist, create it by writing 'mytext' into it." But the second version always creates an empty file because the C shell's order of evaluation is to look for and evaluate I/O redirection operators on each command line as it reads it, before examining the rest of the line to see if it contains a control structure.
    # Works as expected
    if ( ! -e myfile ) then
        echo mytext > myfile
    endif

    # Always creates an empty file
    if ( ! -e myfile ) echo mytext > myfile

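By way of contrast, a Bourne-style shell attaches a redirection to the simple command it follows and performs it only when that command actually runs, so a one-line form is safe there. The sketch below is POSIX sh, not csh, and works in a scratch directory created with mktemp (assumed to be available, as it is on most modern systems) so that it leaves no files behind.

```shell
#!/bin/sh
# Work in a throwaway directory so the example does not touch real files.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# First run: myfile does not exist, so echo runs and its redirection creates the file.
[ -e myfile ] || echo mytext > myfile
cat myfile

# Second run: myfile exists, so echo is skipped and the file is left untouched.
[ -e myfile ] || echo replaced > myfile
cat myfile

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Both cat commands print "mytext": the file is created with its content on the first run and is neither emptied nor overwritten on the second.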
The implementation is also criticized for its notoriously poor error messages, e.g., "0 event not found", which yields no information about what the problem is.
The C shell was extremely successful in introducing a large number of ideas including the history mechanism, aliases, tilde notation, interactive filename completion, an expression grammar built into the shell, etc., that have since been copied by other Unix shells. But in contrast to sh, which has spawned a large number of independently developed clones, including ksh and bash, only two csh clones are known. (tcsh is not considered a clone of csh because it was not independently developed. tcsh was based on the same code originally written by Bill Joy and only added features.)
In 1986, Allen Holub wrote On Command: Writing a Unix-Like Shell for MS-DOS,[13] a book describing a program he'd written called "SH" but which in fact copied the language design and features of csh, not sh. Companion diskettes containing full source for SH and for a basic set of Unix-like utilities (cat, cp, grep, etc.) were available for $25 and $30, respectively, from the publisher. The control structures, expression grammar, history mechanism and other features in Holub's SH were identical to those of the C shell.
In 1988, Hamilton Laboratories began shipping Hamilton C shell for OS/2.[14] In 1992, Hamilton C shell was released for Windows NT.[15] The Windows version continues to be actively supported[16] but the OS/2 version was discontinued in 2003. Hamilton C shell was written by Nicole Hamilton and includes both a csh clone and a set of Unix-like utilities. An early quick reference[17] described the intent as "full compliance with the entire C shell language (except job control)" but with improvements to the language design and adaptation to the differences between Unix and a PC. The most important improvement was a top-down parser that allowed control structures to be nested or piped, something the original C shell could not support, given its ad hoc parser. Hamilton also added new language features including built-in and user-defined procedures, block-structured local variables and floating point arithmetic. Adaptation to a PC included support for the filename and other conventions on a PC and the use of threads instead of fork (which wasn't available under either OS/2 or Windows) to achieve parallelism, e.g., in setting up a pipeline.